Voting Massive Collections of Bayesian Network Classifiers for Data Streams

نویسنده

  • Remco R. Bouckaert
چکیده

We present a new method for voting exponential (in the number of attributes) size sets of Bayesian classifiers in polynomial time with polynomial memory requirements. Training is linear in the number of instances in the dataset and can be performed incrementally. This allows the collection to learn from massive data streams. The method allows for flexibility in balancing computational complexity, memory requirements and classification performance. Unlike many other incremental Bayesian methods, all statistics kept in memory are directly used in classification. Experimental results show that the classifiers perform well on both small and very large data sets, and that classification performance can be weighed against computational and memory costs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A fast incremental extreme learning machine algorithm for data streams classification

Data streams classification is an important approach to get useful knowledge from massive and dynamic data. Because of concept drift, traditional data mining techniques cannot be directly applied in data streams environment. Extreme learning machine (ELM) is a single hidden layer feedforward neural network (SLFN), comparing with the traditional neural network (e.g. BP network), ELM has a faster...

متن کامل

What Can We Do with Graph-Structured Data? - A Data Mining Perspective

Feedback in multimodal self-organizing networks enhances perception of corrupted stimuli p. 19 Identification of fuzzy relation model using HFC-based parallel genetic algorithms and information data granulation p. 29 Evaluation of incremental knowledge acquisition with simulated experts p. 39 Finite domain bounds consistency revisited p. 49 Speeding up weighted constraint satisfaction using red...

متن کامل

Mining multi-dimensional concept-drifting data streams using Bayesian network classifiers

In recent years, a plethora of approaches have been proposed to deal with the increasingly challenging task of mining concept-drifting data streams. However, most of these approaches can only be applied to uni-dimensional classification problems where each input instance has to be assigned to a single output class variable. The problem of mining multi-dimensional data streams, which includes mu...

متن کامل

Voting Principle Based on Nearest kernel classifier and Naive Bayesian classifier

This paper presented a voting principle based on multiple classifiers. This voting principle was based on the naïve Bayesian classification algorithm and a new method based on nearest to class kernel classifier that was proposed. The recognition ability of each classifier to each sample is not the same. A model of each classifier was obtained by the training on the train data, which acts as bas...

متن کامل

Bagged Voting Ensembles

Bayesian and decision tree classifiers are among the most popular classifiers used in the data mining community and recently numerous researchers have examined their sufficiency in ensembles. Although, many methods of ensemble creation have been proposed, there is as yet no clear picture of which method is best. In this work, we propose Bagged Voting using different subsets of the same training...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006